Gesture in Automatic Discourse Processing
Abstract
Computers cannot fully understand spoken language without access to the wide range of modalities that accompany speech. This thesis addresses the particularly expressive modality of hand gesture, and focuses on building structured statistical models at the intersection of speech, vision, and meaning. My approach is distinguished in two key respects. First, gestural patterns are leveraged to discover parallel structures in the meaning of the associated speech. This differs from prior work that attempted to interpret individual gestures directly, an approach that was prone to a lack of generality across speakers. Second, I present novel, structured statistical models for multimodal language processing, which enable learning about gesture in its linguistic context, rather than in the abstract. These ideas find successful application in a variety of language processing tasks: resolving ambiguous noun phrases, segmenting speech into topics, and producing keyframe summaries of spoken language. In all three cases, the addition of gestural features – extracted automatically from video – yields significantly improved performance over a state-of-the-art text-only alternative. This marks the first demonstration that hand gesture improves automatic discourse processing.
Thesis Supervisor: Regina Barzilay, Title: Associate Professor
Thesis Supervisor: Randall Davis, Title: Professor
Similar resources
Causal Analysis for Visual Gesture Understanding
We are exploring the use of high-level knowledge about bodies in the visual understanding of gesture. Our hypothesis is that many gestures are metaphorically derived from the motor programs of our everyday interactions with objects and people. For example, many dismissive gestures look like an imaginary object is being brushed or tossed away. At the discourse level, this implicit mass represent...
A Conversational Paradigm for Multimodal Human Interaction
We present an alternative to the manipulative and semaphoric gesture recognition paradigms. Human multimodal communicative behaviors form a tightly integrated whole. We present a paradigm for multimodal analysis in natural discourse based on a feature decompositive psycholinguistically derived model that permits us to access the underlying structure and intent of multimodal communicative discourse....
The relationship between right hemisphere damage and gesture in spontaneous discourse
Background: The assessment and rehabilitation of acquired neurogenic communication disorders rarely involves a systematic analysis of gesture use. The right cerebral hemisphere has been identified as a possible locus of control for gesture. McNeill’s (McNeill & Duncan, 2000) growth point theory posits a structure for the organisation of processes from both cerebral hemispheres which serves to s...
Automatic Disambiguation Of Discourse Particles
In spite of their important quantitative role, discourse particles have so far been neglected in automatic speech processing for two reasons: first, it is not clear what they may contribute to the aims of automatic speech processing, and second, their functions seem to vary so much that it seems difficult to identify the information relevant to such aims. The approach presented here therefore...
Hand Gestures Classification with Multi-Core DTW
Classifications of several gesture types are very helpful in several applications. This paper tries to address fast classifications of hand gestures using DTW over multi-core simple processors. We presented a methodology to distribute templates over multi-cores and then allow parallel execution of the classification. The results were presented to voting algorithm in which the majority vote was ...